Significantly Improved Prediction of Subcellular Localization by Integrating Text and Protein Sequence Data

Authors

  • Annette Höglund
  • Torsten Blum
  • Scott Brady
  • Pierre Dönnes
  • John San Miguel
  • Matthew Rocheford
  • Oliver Kohlbacher
  • Hagit Shatkay
Abstract

Computational prediction of protein subcellular localization is a challenging problem. Several approaches have been presented during the past few years; some attempt to cover a wide variety of localizations, while others focus on a small number of localizations and on specific organisms. We present a comprehensive system that integrates protein sequence-derived data and text-based information. It is tested on three large data sets previously used by leading prediction methods. The results demonstrate that our system significantly outperforms previously reported results for a wide range of eukaryotic subcellular localizations.

Similar resources

SherLoc: high-accuracy prediction of protein subcellular localization by integrating text and protein sequence data

MOTIVATION Knowing the localization of a protein within the cell helps elucidate its role in biological processes, its function and its potential as a drug target. Thus, subcellular localization prediction is an active research area. Numerous localization prediction systems are described in the literature; some focus on specific localizations or organisms, while others attempt to cover a wide r...

Molecular Characterization of the Epstein-Barr Virus BGLF2 Gene, its Expression, and Subcellular Localization

Background: Epstein–Barr virus (EBV) is a universal herpes virus which can cause a life-long and largely asymptomatic infection in the human population. However, the exact pathogenesis of the EBV infection is not well known. Objective: A comprehensive bioinformatics prediction was carried out for investigating the molecular properties of the BGLF2 and to a...

Protein Subcellular Localization Prediction for Fusarium graminearum

The fungal pathogen Fusarium graminearum (teleomorph Gibberella zeae) is the causal agent of several destructive crop diseases. Investigating subcellular localizations of F. graminearum proteins can provide insight into pathogenic mechanisms underlying F. graminearum-host interactions. In this paper, we design a novel balanced ensemble classifier based on support vector machines (SVMs) to predic...

Improving subcellular localization prediction using text classification and the gene ontology

MOTIVATION Each protein performs its functions within some specific locations in a cell. This subcellular location is important for understanding protein function and for facilitating its purification. There are now many computational techniques for predicting location based on sequence analysis and database information from homologs. A few recent techniques use text from biological abstracts: ...

Comparative in silico analyses of proteins involved in serum resistance as promising vaccine candidates against Acinetobacter baumannii

Introduction: Acinetobacter baumannii as a Gram-negative coccobacillus has become a major cause of hospital-acquired infections. The virulence factors involved in serum resistance are important targets in the development of an effective vaccine against this pathogen. Our aim in this project was in silico analyses of A. baumannii proteins involved in serum resistance which could potentially be u...

Journal title:
  • Pacific Symposium on Biocomputing

Publication date: 2006